Search Data Structure

In computer science, a search data structure is any data structure that allows the efficient retrieval of specific items from a set of items, such as a specific record from a database.

The simplest, most general, and least efficient search structure is merely an unordered sequential list of all the items. Locating the desired item in such a list, by the linear search method, inevitably requires a number of operations proportional to the number ''n'' of items, in the worst case as well as in the average case. Useful search data structures allow faster retrieval; however, they are limited to queries of some specific kind. Moreover, since the cost of building such structures is at least proportional to ''n'', they only pay off if several queries are to be performed on the same database (or on a database that changes little between queries).

Static search structures are designed for answering many queries on a fixed database; dynamic structures also allow insertion, deletion, or modification of items between successive queries. In the dynamic case, one must also consider the cost of fixing the search structure to account for the changes in the database.
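
As a point of reference, the linear search over an unordered list described above might be sketched as follows. This is only an illustration: the record layout (dictionaries with an "id" field) and the function name linear_search are assumptions made for the example, not part of any particular database.

```python
from typing import Optional, Sequence


def linear_search(items: Sequence[dict], key: str, value) -> Optional[int]:
    """Return the index of the first record whose `key` field equals `value`.

    Every record may have to be inspected, so the cost grows in
    proportion to the number n of items.
    """
    for index, record in enumerate(items):
        if record.get(key) == value:
            return index
    return None  # no record matched


# Hypothetical records, in no particular order.
records = [
    {"id": 17, "name": "Ada"},
    {"id": 4, "name": "Grace"},
    {"id": 42, "name": "Edsger"},
]

print(linear_search(records, "id", 42))  # 2
print(linear_search(records, "id", 99))  # None
```

No preprocessing is needed here, which is why this approach can still be reasonable when only a single query is ever made.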


Classification

The simplest kind of query is to locate a record that has a specific field (the ''key'') equal to a specified value ''v''. Other common kinds of query are "find the item with smallest (or largest) key value", "find the item with largest key value not exceeding ''v''", and "find all items with key values between specified bounds ''v''_min and ''v''_max".

In certain databases the key values may be points in some multi-dimensional space. For example, the key may be a geographic position (latitude and longitude) on the Earth. In that case, common kinds of queries are "find the record with a key closest to a given point ''v''", "find all items whose key lies at a given distance from ''v''", or "find all items within a specified region ''R'' of the space". A common special case of the latter is a simultaneous range query on two or more simple keys, such as "find all employee records with salary between 50,000 and 100,000 and hired between 1995 and 2007".
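
To make the multi-dimensional query kinds concrete, here is a minimal sketch that answers them by brute force, scanning every record. It shows only what the queries mean, not an efficient structure. The employee records, the function names, and the use of plain Euclidean distance (rather than distance on the Earth's surface) are all assumptions made for the example.

```python
import math

# Hypothetical employee records with two simple keys.
employees = [
    {"name": "A", "salary": 48_000, "hired": 1993},
    {"name": "B", "salary": 75_000, "hired": 1999},
    {"name": "C", "salary": 99_000, "hired": 2010},
    {"name": "D", "salary": 60_000, "hired": 2005},
]


def range_query(records, salary_range, hired_range):
    """Simultaneous range query on two keys (bounds are inclusive)."""
    lo_s, hi_s = salary_range
    lo_h, hi_h = hired_range
    return [
        r for r in records
        if lo_s <= r["salary"] <= hi_s and lo_h <= r["hired"] <= hi_h
    ]


def nearest(points, v):
    """Return the point closest to v, using Euclidean distance."""
    return min(points, key=lambda p: math.dist(p, v))


matches = range_query(employees, (50_000, 100_000), (1995, 2007))
print([r["name"] for r in matches])            # ['B', 'D']
print(nearest([(0, 0), (3, 4), (1, 1)], (2, 2)))  # (1, 1)
```

Both functions inspect all ''n'' items per query; specialized search structures exist precisely to avoid that cost when many such queries are made.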


Single ordered keys

* Array, if the key values span a moderately compact interval
* Priority-sorted list; see linear search
* Key-sorted array; see binary search
* Self-balancing binary search tree
* Hash table
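
For the key-sorted array in the list above, the standard queries on a single ordered key can be sketched with binary search. The example below uses Python's standard bisect module; the sample keys and the helper names contains, predecessor, and key_range are assumptions chosen for illustration.

```python
from bisect import bisect_left, bisect_right

# A key-sorted array (duplicates allowed).
keys = [3, 8, 15, 15, 23, 42]


def contains(sorted_keys, v):
    """Exact-match query: is there a key equal to v?"""
    i = bisect_left(sorted_keys, v)
    return i < len(sorted_keys) and sorted_keys[i] == v


def predecessor(sorted_keys, v):
    """Largest key not exceeding v, or None if every key is greater."""
    i = bisect_right(sorted_keys, v)
    return sorted_keys[i - 1] if i > 0 else None


def key_range(sorted_keys, v_min, v_max):
    """All keys k with v_min <= k <= v_max."""
    lo = bisect_left(sorted_keys, v_min)
    hi = bisect_right(sorted_keys, v_max)
    return sorted_keys[lo:hi]


print(contains(keys, 15))       # True
print(predecessor(keys, 20))    # 15
print(key_range(keys, 8, 23))   # [8, 15, 15, 23]
print(keys[0], keys[-1])        # 3 42  (smallest and largest key)
```

Each lookup takes a number of comparisons logarithmic in the array length, although the range query additionally pays for copying the matching keys. Inserting a new key into the sorted array still requires shifting elements, which is one reason dynamic structures such as self-balancing trees are used instead.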


Finding the smallest element

* Heap
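
A minimal sketch of using a heap to find the smallest element, based on Python's standard heapq module (a binary min-heap stored in a list). The sample keys are arbitrary.

```python
import heapq

heap = []
for key in [23, 8, 42, 3, 15]:
    heapq.heappush(heap, key)  # insert while keeping the heap property

smallest = heap[0]              # peek at the minimum without removing it
removed = heapq.heappop(heap)   # remove and return the minimum
next_smallest = heap[0]         # the new minimum after removal

print(smallest, removed, next_smallest)  # 3 3 8
```

A heap does not support efficient lookup of an arbitrary key; it is specialized for the smallest-element query.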


Asymptotic worst-case analysis

In this table, the asymptotic notation ''O''(''f''(''n'')) means "not exceeding some fixed multiple of ''f''(''n'') in the worst case."

''Note'': Insert on an unsorted array is sometimes quoted as being ''O''(''n'') due to the assumption that the element to be inserted must be inserted at one particular location of the array, which would require shifting all the subsequent elements by one position. However, in a classic array, the array is used to store arbitrary unsorted elements, and hence the exact position of any given element is of no consequence; insert is carried out by increasing the array size by 1 and storing the element at the end of the array, which is an ''O''(1) operation. Likewise, the deletion operation is sometimes quoted as being ''O''(''n'') due to the assumption that subsequent elements must be shifted, but in a classic unsorted array the order is unimportant (though elements are implicitly ordered by insertion time), so deletion can be carried out by swapping the element to be deleted with the last element in the array and then decrementing the array size by 1, which is an ''O''(1) operation.

This table is only an approximate summary; for each data structure there are special situations and variants that may lead to different costs. Also, two or more data structures can be combined to obtain lower costs.
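
The constant-time insert and delete on an unsorted array described in the note can be sketched as follows. The class name UnsortedArray and its methods are hypothetical, and a Python list stands in for the array, so appending is ''O''(1) only in the amortized sense.

```python
class UnsortedArray:
    """Unordered collection with O(1) insert and O(1) delete-by-index."""

    def __init__(self):
        self._items = []

    def insert(self, item):
        """Store the item at the end and return its index."""
        self._items.append(item)  # amortized O(1) for a Python list
        return len(self._items) - 1

    def delete_at(self, index):
        """Remove the item at `index` by overwriting it with the last item."""
        if not 0 <= index < len(self._items):
            raise IndexError("index out of range")
        last = self._items.pop()          # shrink the array by one
        if index == len(self._items):     # the deleted item was the last one
            return last
        removed = self._items[index]
        self._items[index] = last         # fill the hole; no shifting needed
        return removed

    def items(self):
        return list(self._items)


array = UnsortedArray()
for value in ["a", "b", "c", "d"]:
    array.insert(value)

print(array.delete_at(1))  # b
print(array.items())       # ['a', 'd', 'c']
```

The deletion is constant-time only once the position of the element is known; locating an element by value in an unsorted array still requires a linear search.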


See also

* List of data structures
* Skip list